Multivariate Calculus

Vector Functions of Several Variables

A vector-valued function of several variables is a function

$$f: \mathbb{R}^m \rightarrow \mathbb{R}^n,$$

i.e. a function of $m$-dimensional vectors which returns $n$-dimensional vectors.
Examples

A real-valued function of several variables: $f:\mathbb{R}^3\to\mathbb{R}$, $f(x_1,x_2,x_3)=2x_1+3x_2+4x_3$.

$f$ is linear and $f(x)=Ax$, where

$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \quad\text{and}\quad A = \begin{bmatrix} 2 & 3 & 4 \end{bmatrix}.$$

Let $f: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ where

$$f(x_1,x_2) = \begin{pmatrix} x_1 + x_2 \\ x_1 - x_2 \end{pmatrix}.$$

Note that $f(x)=Ax$, where

$$A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$

Let $f: \mathbb{R}^3 \rightarrow \mathbb{R}^4$ be defined by

$$f(x) = \begin{pmatrix} x_1 + x_2 \\ x_1 - x_3 \\ x_2 - x_3 \\ x_1 + x_2 + x_3 \end{pmatrix}.$$

Then $f(x) = Ax$, where

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & -1 \\ 1 & 1 & 1 \end{bmatrix}.$$

These multidimensional functions do not have to be linear; for example, the function $f:\mathbb{R}^2\to\mathbb{R}^2$,

$$f(x) = \begin{pmatrix} x_1 \cdot x_2 \\ x_1^2 + x_2^2 \end{pmatrix},$$

is obviously not linear.
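A linear map of this kind is just a matrix product, which a few lines of NumPy can illustrate. The sketch below uses the matrices from the examples above and adds a simple check that the last function fails the homogeneity property $f(2x) = 2f(x)$ that every linear map satisfies:

```python
import numpy as np

# The linear map f(x) = Ax from the second example above
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])

def f_linear(x):
    return A @ x

# The nonlinear map f(x) = (x1*x2, x1^2 + x2^2)
def f_nonlinear(x):
    return np.array([x[0] * x[1], x[0]**2 + x[1]**2])

x = np.array([2.0, 3.0])

print(f_linear(x))          # [ 5. -1.]
print(f_linear(2 * x))      # equals 2 * f_linear(x): linearity holds
print(f_nonlinear(2 * x))   # NOT equal to 2 * f_nonlinear(x): not linear
```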
The Gradient

Suppose the real-valued function $f:\mathbb{R}^m \rightarrow \mathbb{R}$ is differentiable in each coordinate. Then the gradient of $f$, denoted $\nabla f$, is given by

$$\nabla f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1}, & \dots, & \frac{\partial f}{\partial x_m} \end{pmatrix}.$$

Details

Suppose the real-valued function $f:\mathbb{R}^m \rightarrow \mathbb{R}$ is differentiable in each coordinate. Then the gradient of $f$, denoted $\nabla f$, is given by

$$\nabla f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1}, & \dots, & \frac{\partial f}{\partial x_m} \end{pmatrix},$$

where each partial derivative $\frac{\partial f}{\partial x_i}$ is computed by differentiating $f$ with respect to that variable, regarding the others as fixed.
Examples

Let

$$f(x,y) = x^2 + y^2 + 2xy.$$

Then the partial derivatives of $f$ are

$$\frac{\partial f}{\partial x} = 2x + 2y$$

and

$$\frac{\partial f}{\partial y} = 2y + 2x,$$

and the gradient of $f$ is therefore

$$\nabla f = \begin{pmatrix} 2x+2y, & 2y+2x \end{pmatrix}.$$

Let

$$f(\underline{x}) = x_1 - x_2.$$

The gradient of $f$ is

$$\nabla f = \begin{pmatrix} 1, & -1 \end{pmatrix}.$$
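Gradients like these are easy to sanity-check numerically with central differences. The following is a sketch (the step size `h` is a typical choice, not a tuned value), evaluated at the arbitrary point $(1, 2)$, where $(2x+2y,\; 2y+2x) = (6, 6)$:

```python
def grad(f, x, h=1e-6):
    """Numerical gradient of f: R^m -> R via central differences."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

# f(x, y) = x^2 + y^2 + 2xy from the example above
f = lambda v: v[0]**2 + v[1]**2 + 2 * v[0] * v[1]

g = grad(f, [1.0, 2.0])
print(g)   # approximately [6.0, 6.0], matching (2x+2y, 2y+2x) at (1, 2)
```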
The Jacobian

Now consider a function $f:\mathbb{R}^m\to\mathbb{R}^n$. Write $f_i$ for the $i^{th}$ coordinate of $f$, so we can write $f(x)=(f_1(x),f_2(x),\ldots,f_n(x))$, where $x\in\mathbb{R}^m$. If each coordinate function $f_i$ is differentiable in each variable, we can form the Jacobian matrix of $f$:

$$\begin{pmatrix} \nabla f_1 \\ \vdots \\ \nabla f_n \end{pmatrix}.$$

Details

Now consider a function $f:\mathbb{R}^m\to\mathbb{R}^n$. Write $f_i$ for the $i^{th}$ coordinate of $f$, so we can write $f(x)=(f_1(x),f_2(x),\ldots,f_n(x))$, where $x\in\mathbb{R}^m$. If each coordinate function $f_i$ is differentiable in each variable, we can form the Jacobian matrix of $f$:

$$\begin{pmatrix} \nabla f_1 \\ \vdots \\ \nabla f_n \end{pmatrix}.$$

In this matrix, the element in the $i^{th}$ row and $j^{th}$ column is $\frac{\partial f_i}{\partial x_j}$.
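Just as with the gradient, the Jacobian matrix can be approximated column by column with central differences. A sketch, using $f(x,y) = (x^2+y,\; xy,\; x)$ at the arbitrary point $(1, 2)$:

```python
import numpy as np

def jacobian(f, x, h=1e-6):
    """Numerical Jacobian of f: R^m -> R^n via central differences."""
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = h
        cols.append((f(x + e) - f(x - e)) / (2 * h))
    return np.column_stack(cols)

f = lambda v: np.array([v[0]**2 + v[1], v[0] * v[1], v[0]])

J = jacobian(f, [1.0, 2.0])
print(J)
# approximately [[2. 1.]
#                [2. 1.]
#                [1. 0.]]  i.e. rows (2x, 1), (y, x), (1, 0) at (1, 2)
```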
Examples

For the function

$$f(x,y) = \begin{pmatrix} x^2 + y \\ xy \\ x \end{pmatrix} = \begin{pmatrix} f_1(x,y) \\ f_2(x,y) \\ f_3(x,y) \end{pmatrix},$$

the Jacobian matrix of $f$ is the matrix

$$J = \begin{bmatrix} \nabla f_1 \\ \nabla f_2 \\ \nabla f_3 \end{bmatrix} = \begin{bmatrix} 2x & 1 \\ y & x \\ 1 & 0 \end{bmatrix}.$$

Univariate Integration By Substitution

If $f$ is a continuous function and $g$ is strictly increasing and differentiable, then

$$\int_{g(a)}^{g(b)} f(x)\,dx = \int_a^b f(g(t))g^\prime(t)\,dt.$$
Details

If $f$ is a continuous function and $g$ is strictly increasing and differentiable, then

$$\int_{g(a)}^{g(b)} f(x)\,dx = \int_a^b f(g(t))g^\prime(t)\,dt.$$

It follows that if $X$ is a continuous random variable with density $f$ and $Y = h(X)$ is a function of $X$ that has the inverse $g = h^{-1}$, so $X = g(Y)$, then the density of $Y$ is given by

$$f_Y(y) = f(g(y))g^\prime(y).$$

This is a consequence of

$$P[Y \leq b] = P[g(Y) \leq g(b)] = P[X \leq g(b)] = \int_{-\infty}^{g(b)} f(x)\,dx = \int_{-\infty}^{b} f(g(y))g^\prime(y)\,dy.$$
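The density formula can be checked numerically in a concrete case. A sketch: take $X$ uniform on $(0,1)$ and $Y = X^2$, so $g(y) = \sqrt{y}$, $g^\prime(y) = 1/(2\sqrt{y})$, and $f_Y(y) = 1/(2\sqrt{y})$; then $P[Y \leq 0.25] = P[X \leq 0.5] = 0.5$. The midpoint rule is an arbitrary choice of quadrature here:

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# X ~ Uniform(0, 1), Y = X^2: density of Y is f(g(y)) g'(y) = 1 / (2*sqrt(y))
f_Y = lambda y: 1.0 / (2.0 * math.sqrt(y))

# Integrating the transformed density recovers P[Y <= 0.25] = 0.5
p = midpoint_integral(f_Y, 0.0, 0.25)
print(p)   # approximately 0.5
```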
Multivariate Integration By Substitution

Suppose $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a continuous function and $g: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is a one-to-one function with continuous partial derivatives. Then if $U \subseteq \mathbb{R}^n$ is a subset,

$$\int_{g(U)} f(\mathbf{x})\,d\mathbf{x} = \int_{U} f(g(\mathbf{y}))\,|J|\,d\mathbf{y},$$

where $J$ is the Jacobian determinant and $|J|$ is its absolute value:

$$J = \left| \begin{bmatrix} \frac{\partial g_1}{\partial y_1} & \frac{\partial g_1}{\partial y_2} & \cdots & \frac{\partial g_1}{\partial y_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial g_n}{\partial y_1} & \frac{\partial g_n}{\partial y_2} & \cdots & \frac{\partial g_n}{\partial y_n} \end{bmatrix} \right| = \left| \begin{bmatrix} \nabla g_1 \\ \vdots \\ \nabla g_n \end{bmatrix} \right|$$

Details

Suppose $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a continuous function and $g: \mathbb{R}^n \rightarrow \mathbb{R}^n$ is a one-to-one function with continuous partial derivatives. Then if $U \subseteq \mathbb{R}^n$ is a subset,

$$\int_{g(U)} f(\mathbf{x})\,d\mathbf{x} = \int_{U} f(g(\mathbf{y}))\,|J|\,d\mathbf{y},$$

where $J$ is the Jacobian determinant and $|J|$ is its absolute value:

$$J = \left| \begin{bmatrix} \frac{\partial g_1}{\partial y_1} & \frac{\partial g_1}{\partial y_2} & \cdots & \frac{\partial g_1}{\partial y_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial g_n}{\partial y_1} & \frac{\partial g_n}{\partial y_2} & \cdots & \frac{\partial g_n}{\partial y_n} \end{bmatrix} \right| = \left| \begin{bmatrix} \nabla g_1 \\ \vdots \\ \nabla g_n \end{bmatrix} \right|$$

Calculations similar to those in 28.4 show that if $\mathbf{X} = (X_1, \ldots, X_n)^\prime$ is a continuous multivariate random variable with density $f$ and $\mathbf{Y} = \mathbf{h}(\mathbf{X})$, where $\mathbf{h}$ is one-to-one with inverse $\mathbf{g} = \mathbf{h}^{-1}$, so $\mathbf{X} = \mathbf{g}(\mathbf{Y})$, then the density of $\mathbf{Y}$ is given by

$$f_Y(\mathbf{y}) = f(\mathbf{g}(\mathbf{y}))\,|J|.$$
Examples

If $\mathbf{Y} = A\mathbf{X}$, where $A$ is an $n \times n$ matrix with $\det(A) \neq 0$ and $\mathbf{X} = (X_1, \ldots, X_n)^\prime$ has independent and identically distributed components, then we have the following results.

The joint density of $X_1, \ldots, X_n$ is the product of the individual (marginal) densities,

$$f_X(\mathbf{x}) = f(x_1)f(x_2)\cdots f(x_n).$$

The matrix of partial derivatives corresponds to $\frac{\partial \mathbf{g}}{\partial \mathbf{y}}$, where $\mathbf{X} = \mathbf{g}(\mathbf{Y})$, i.e. these are the derivatives of the transformation $\mathbf{X} = \mathbf{g}(\mathbf{Y}) = A^{-1}\mathbf{Y}$, or $\mathbf{X} = B\mathbf{Y}$ where $B = A^{-1}$.

But if $\mathbf{X} = B\mathbf{Y}$, then

$$X_i = b_{i1}y_1 + b_{i2}y_2 + \cdots + b_{ij}y_j + \cdots + b_{in}y_n.$$

So $\frac{\partial x_i}{\partial y_j} = b_{ij}$, and thus

$$J = \left|\frac{\partial \mathbf{x}}{\partial \mathbf{y}}\right| = |B| = |A^{-1}| = \frac{1}{|A|},$$

where $|\cdot|$ denotes the determinant. The density of $\mathbf{Y}$ is therefore

$$f_Y(\mathbf{y}) = f_X(\mathbf{g}(\mathbf{y}))\,|J| = f_X(A^{-1}\mathbf{y})\,\left|\det(A^{-1})\right|.$$
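For independent standard normal components $X_i$, this result can be verified against the known density of $\mathbf{Y} = A\mathbf{X}$, which is multivariate normal with covariance $AA^\prime$. A sketch, with an arbitrarily chosen invertible matrix $A$ and evaluation point $y$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])     # arbitrary invertible matrix
y = np.array([1.0, -0.5])      # arbitrary evaluation point

def std_normal_density(x):
    """Joint density of iid standard normal components."""
    return np.exp(-0.5 * np.sum(x**2)) / (2 * np.pi) ** (len(x) / 2)

# Change-of-variables formula: f_Y(y) = f_X(A^{-1} y) * |det(A^{-1})|
Ainv = np.linalg.inv(A)
f_y_change_of_var = std_normal_density(Ainv @ y) * abs(np.linalg.det(Ainv))

# Direct N(0, AA') density for comparison (n = 2 here)
S = A @ A.T
f_y_direct = np.exp(-0.5 * y @ np.linalg.solve(S, y)) / (
    2 * np.pi * np.sqrt(np.linalg.det(S)))

print(f_y_change_of_var, f_y_direct)   # the two values agree
```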